Incremental Parser Generation for Tree Adjoining Grammars

Author

  • Anoop Sarkar
Abstract

This paper describes the incremental generation of parse tables for the LR-type parsing of Tree Adjoining Languages (TALs). The algorithm presented handles modifications to the input grammar by updating the parser generated so far. In this paper, a lazy generation of LR-type parsers for TALs is defined in which parse tables are created by need while parsing. We then describe an incremental parser generator for TALs which responds to modification of the input grammar by updating parse tables built so far.

1 LR Parser Generation

Tree Adjoining Grammars (TAGs) are tree rewriting systems which combine trees with the single operation of adjoining. Schabes and Vijay-Shanker (1990) describe the construction of an LR parsing algorithm for TAGs. Parser generation here is taken to be the construction of LR(0) tables (i.e., without any lookahead) for a particular TAG. The moves made by the parser can be explained by an automaton which is weakly equivalent to TAGs called Bottom-Up Embedded Pushdown Automata (BEPDA) (Schabes and Vijay-Shanker, 1990). Storage in a BEPDA is a sequence of stacks, where new stacks can be introduced above and below the top stack in the automaton. Recognition of adjunction is equivalent to the unwrap move shown in Fig. 1.

This work is partially supported by NSF grant NSFSTC SBR 8920230, ARPA grant N00014-94 and ARO grant DAAH04-94-G0426. Thanks to Breck Baldwin, Dania Egedi, Jason Eisner, B. Srinivas and the three anonymous reviewers for their valuable comments.

Familiarity with TAGs and their parsing techniques is assumed throughout the paper; see (Schabes and Joshi, 1991) for an introduction. We assume that our definition of TAG does not have the substitution operation.

See (Aho et al., 1986) for details on LR parsing.

The algorithm described here can be extended to use SLR(1) tables (Schabes and Vijay-Shanker, 1990).
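
The idea of creating parse tables "by need while parsing" can be illustrated with a much simpler analogue. The sketch below is not the paper's algorithm: it uses an ordinary context-free grammar rather than a TAG, and plain LR(0) item sets rather than the BEPDA-based tables of Schabes and Vijay-Shanker. The grammar, the class name `LazyLR0`, and the function `recognize` are all hypothetical names chosen for illustration. Grammar modification is handled here by simply discarding the cached entries, which is coarser than the incremental update the abstract describes.

```python
# Toy context-free grammar (an assumption for illustration, not a TAG):
#   S' -> S        S -> ( S ) | x
GRAMMAR = {
    "S'": [("S",)],
    "S": [("(", "S", ")"), ("x",)],
}


class LazyLR0:
    """LR(0) item sets whose goto entries are computed only on first use."""

    def __init__(self, grammar, start):
        # Copy the production lists so later updates do not mutate the input.
        self.grammar = {lhs: list(rhss) for lhs, rhss in grammar.items()}
        self.start = start
        self.goto_cache = {}  # (state, symbol) -> state or None

    def closure(self, items):
        """Standard LR(0) closure; an item is (lhs, rhs, dot_position)."""
        items = set(items)
        work = list(items)
        while work:
            _, rhs, dot = work.pop()
            if dot < len(rhs) and rhs[dot] in self.grammar:
                for prod in self.grammar[rhs[dot]]:
                    item = (rhs[dot], prod, 0)
                    if item not in items:
                        items.add(item)
                        work.append(item)
        return frozenset(items)

    def initial_state(self):
        return self.closure({(self.start, rhs, 0) for rhs in self.grammar[self.start]})

    def goto(self, state, symbol):
        """Build the successor state the first time it is requested."""
        key = (state, symbol)
        if key not in self.goto_cache:
            moved = {(l, r, d + 1) for (l, r, d) in state
                     if d < len(r) and r[d] == symbol}
            self.goto_cache[key] = self.closure(moved) if moved else None
        return self.goto_cache[key]

    def update_grammar(self, lhs, rhs):
        """Add a production; coarse invalidation drops every cached entry."""
        self.grammar.setdefault(lhs, []).append(tuple(rhs))
        self.goto_cache.clear()


def recognize(table, tokens):
    """LR(0) recognizer; assumes the grammar has no LR(0) conflicts."""
    tokens = list(tokens)
    stack = [table.initial_state()]
    pos = 0
    while True:
        state = stack[-1]
        complete = [(l, r) for (l, r, d) in state if d == len(r)]
        if complete:
            lhs, rhs = complete[0]
            if lhs == table.start:
                return pos == len(tokens)
            del stack[len(stack) - len(rhs):]  # pop one state per rhs symbol
            nxt = table.goto(stack[-1], lhs)
        else:
            if pos == len(tokens):
                return False
            nxt = table.goto(state, tokens[pos])
            pos += 1
        if nxt is None:
            return False
        stack.append(nxt)
```

Parsing the one-token input `x` with a fresh `LazyLR0(GRAMMAR, "S'")` fills only the two goto entries that input needs, and the rest of the table is never built unless a later input reaches it. Calling `update_grammar("S", ["y"])` then lets the same object accept `(y)`, at the cost of rebuilding entries on demand.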

Similar resources

Negotiation Strategies in an Integrated Natural Language Generation System

This paper describes negotiation strategies in an integrated generation system (INLGS) based on the formalism of Schema–Tree Adjoining Grammars with Unification (SU–TAGs). Integrated or uniform means that all knowledge bases are specified in the same formalism and run the same processing algorithm. In our project a reversible parser/generator runs knowledge bases in the formalism of Schema–Tree...

Parsing with Underspecifications

This paper describes a direct parser for Schema–Tree Adjoining Grammars (S–TAG) which explores schemata, i.e. underspecified elementary rules. Basically, a schema in an S–TAG represents a possibly infinite set of elementary rules by folding up all actual substructures and depicting them in terms of a regular expression (RX). Hence, S–TAGs provide a more condensed grammar representation. In the f...

TuLiPA - Parsing Extensions of TAG with Range Concatenation Grammars

In this paper we present a parsing framework for extensions of Tree Adjoining Grammars (TAG) called TuLiPA (Tübingen Linguistic Parsing Architecture). In particular, besides TAG, the parser can process Tree-Tuple MCTAG with shared nodes (TT-MCTAG), a TAG-extension that has been proposed to deal with scrambling in free word order languages such as German. The central strategy of the parser is su...

Licensing and Tree Adjoining Grammar in Government Binding Parsing

This paper presents an implemented, psychologically plausible parsing model for Government Binding theory grammars. I make use of two main ideas: (1) a generalization of the licensing relations of [Abney, 1986] allows for the direct encoding of certain principles of grammar (e.g. Theta Criterion, Case Filter) which drive structure building; (2) the working space of the parser is constrained to ...

Fast LR parsing Using Rich (Tree Adjoining) Grammars

We describe an LR parser of parts-of-speech (and punctuation labels) for Tree Adjoining Grammars (TAGs) that solves table conflicts in a greedy way, with a limited amount of backtracking. We evaluate the parser using the Penn Treebank, showing that the method yields very fast parsers with at least reasonable accuracy, confirming the intuition that LR parsing benefits from the use of rich grammars.

Publication date: 1996